Performance Analysis of Branch Prediction Unit for Pipelined Processors
Abstract
The branch predictor plays a crucial role in achieving effective performance in microprocessors with pipelined architectures. This paper analyzes the performance of a branch prediction unit for pipelined processors. A 512-byte memory is designed for storing instructions, and a 32-byte memory is designed for the branch target buffer (BTB), which stores the history of branch instructions. A finite state machine (FSM) is designed for the branch predictor unit. It has four states: strongly taken, weakly taken, weakly not taken, and strongly not taken. Prediction is based on the current state of the FSM: if the state is weakly taken or strongly taken, the predictor guesses that the branch is taken; otherwise the branch is assumed to be not taken. When a branch instruction is executed for the first time, the BTB stores the address of the current instruction and the address to which it jumps. The state of the FSM is then updated accordingly. The program is executed both with and without the branch predictor unit. The latency of the processor with the branch prediction unit and of the processor without it is computed and compared. The simulation results show that the branch prediction unit decreases latency.
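The four-state predictor and BTB described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' hardware design: the names `State`, `BranchTargetBuffer`, and `count_mispredictions` are hypothetical, the BTB is modeled as an unbounded dictionary rather than a 32-byte memory, and the initial state of a newly stored branch (weakly not taken) is an assumption, since the abstract does not specify it.

```python
from enum import IntEnum


class State(IntEnum):
    # The four FSM states named in the abstract, ordered as a 2-bit counter.
    STRONGLY_NOT_TAKEN = 0
    WEAKLY_NOT_TAKEN = 1
    WEAKLY_TAKEN = 2
    STRONGLY_TAKEN = 3


def predict_taken(state):
    # Weakly taken or strongly taken -> guess "taken"; otherwise "not taken".
    return state >= State.WEAKLY_TAKEN


def next_state(state, taken):
    # Move one step toward the observed outcome, saturating at both ends.
    if taken:
        return State(min(state + 1, State.STRONGLY_TAKEN))
    return State(max(state - 1, State.STRONGLY_NOT_TAKEN))


class BranchTargetBuffer:
    """Hypothetical BTB model: branch address -> [target address, FSM state]."""

    def __init__(self, initial_state=State.WEAKLY_NOT_TAKEN):
        # Assumption: new entries start in the weakly-not-taken state.
        self.entries = {}
        self.initial_state = initial_state

    def predict(self, pc):
        # A branch never seen before has no entry and is predicted not taken.
        entry = self.entries.get(pc)
        if entry is None:
            return False, None
        target, state = entry
        return predict_taken(state), target

    def update(self, pc, target, taken):
        # First execution: store the branch address and its jump target.
        if pc not in self.entries:
            self.entries[pc] = [target, self.initial_state]
        # Then update the FSM state with the actual outcome.
        entry = self.entries[pc]
        entry[1] = next_state(entry[1], taken)


def count_mispredictions(trace, btb):
    # trace: iterable of (branch address, target address, actually taken).
    misses = 0
    for pc, target, taken in trace:
        predicted, _ = btb.predict(pc)
        if predicted != taken:
            misses += 1
        btb.update(pc, target, taken)
    return misses
```

For example, a loop branch that is taken nine times and then falls through can be modeled as `[(0x10, 0x04, True)] * 9 + [(0x10, 0x04, False)]`. Under the assumptions above, the sketch mispredicts only the first iteration and the loop exit. Turning such a misprediction count into a latency figure would require a pipeline penalty per misprediction, which the abstract does not give.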
Similar resources
Analysis of Combined Bimodal and GShare Branch Prediction Schemes
Taking advantage of instruction level parallelism is one of the key factors in improving computer performance. The main performance inhibitor in a pipelined processor is the branch penalty. Many techniques have been developed to reduce this penalty. The technique of branch prediction was explored further. Branch predictors are crucial in today's modern, superscalar processors for achieving high...
New Algorithm Improves Branch Prediction: 3/27/95
Intel’s P6 processor (see 090202.PDF) is the first to use a two-level branch-prediction algorithm to improve accuracy. This algorithm, first published by Tse-Yu Yeh and Yale Patt, has the potential to push accuracy well beyond the 90% level achieved by the best processors today. As future processors look to improve performance by increasing the issue rate and/or extending the pipeline depth, th...
Speculative Branch Folding for Pipelined Processors
This paper proposes an effective branch folding technique which combines branch instructions with predicted instructions. This technique can be implemented using an instruction queue, which buffers prefetched instructions. Most of the instructions in the instruction queue are forwarded to the execution unit in sequence. Branch instructions, however, are combined with predicted instructions in t...
A latency-conscious SMT branch prediction architecture
Executing multiple threads has proved to be an effective solution to partially hide latencies that appear in a processor. When a thread is stalled because a long-latency operation is being processed, like a memory access or a floating-point calculation, the processor can switch to another context so that another thread can take advantage of the idle resources. However, fetch stall conditions cau...
Speculative Updates of Local and Global Branch History: A Quantitative Analysis
In today’s wide-issue processors, even small branch-misprediction rates introduce substantial performance penalties. Worse yet, inadequate branch prediction creates a bottleneck at the fetch stage, restricting other opportunities for improving performance. The choice of how to predict conditional-branch outcomes is the primary lever on prediction accuracy. But the choice of when to update the p...